Uniformization in Markov Decision Processes
Authors
Abstract
Continuous-time Markov decision processes (CTMDPs) may be viewed as a special case of semi-Markov decision processes (SMDPs) in which the intertransition times are exponentially distributed and the decision maker is allowed to choose actions whenever the system state changes. When the transition rates are identical for each state and action pair, one can convert a CTMDP into an equivalent discrete-time Markov decision process (DTMDP), which is easier to analyze and solve. In this article, we describe uniformization, which uses fictitious transitions from a state to itself and hence enables the conversion of a CTMDP with nonidentical transition rates into an equivalent DTMDP. We first demonstrate the use of uniformization in converting a continuous-time Markov chain into an equivalent discrete-time Markov chain, and then describe how it is used in the context of CTMDPs with the discounted reward criterion. We also present examples of the use of uniformization in continuous-time Markov models.
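The conversion described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article; it follows the standard textbook construction. All names here (`uniformize_ctmc`, `uniformize_ctmdp`, `rates`, `probs`, `reward_rates`, `alpha`) are hypothetical. The sketch assumes a finite state space, a uniformization rate at least as large as every exit rate, and, for the CTMDP, rewards that accrue only as continuous reward rates discounted at rate `alpha`.

```python
def uniformize_ctmc(Q, rate=None):
    """Uniformize a CTMC with generator Q (rows sum to zero).

    Returns (P, rate), where P = I + Q / rate is the transition matrix of
    the discrete-time chain; the slack on the diagonal plays the role of
    the fictitious self-transitions.
    """
    n = len(Q)
    max_exit = max(-Q[i][i] for i in range(n))
    if rate is None:
        rate = max_exit  # smallest admissible uniformization rate
    if rate <= 0 or rate < max_exit:
        raise ValueError("rate must be positive and >= the largest exit rate")
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    return P, rate


def uniformize_ctmdp(rates, probs, reward_rates, alpha, rate=None):
    """Uniformize a discounted CTMDP (assumed input layout, see lead-in).

    rates[s][a]        : transition rate out of state s under action a
    probs[s][a][t]     : probability that the next state is t
    reward_rates[s][a] : reward earned per unit time in s under a
    alpha              : continuous-time discount rate (> 0)

    Returns (P, R, beta): discrete-time transition probabilities,
    one-step rewards, and the discount factor beta = rate / (rate + alpha).
    """
    max_rate = max(max(row) for row in rates)
    if rate is None:
        rate = max_rate
    if rate <= 0 or rate < max_rate:
        raise ValueError("rate must be positive and >= every transition rate")
    n = len(rates)
    P, R = [], []
    for s in range(n):
        P_s, R_s = [], []
        for a in range(len(rates[s])):
            # Real transitions are thinned by rates[s][a] / rate.
            row = [rates[s][a] * probs[s][a][t] / rate for t in range(n)]
            # Remaining probability mass becomes a self-transition.
            row[s] = 0.0
            row[s] = 1.0 - sum(row)
            P_s.append(row)
            R_s.append(reward_rates[s][a] / (rate + alpha))
        P.append(P_s)
        R.append(R_s)
    beta = rate / (rate + alpha)
    return P, R, beta
```

For example, `uniformize_ctmc([[-2.0, 2.0], [1.0, -1.0]])` uses rate 2, so the slower state (exit rate 1) receives a self-transition with probability 0.5. The resulting discrete-time model can then be passed to any standard discounted DTMDP solver such as value iteration.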
Similar resources
Numerical Solution of Non-Homogeneous Markov Processes through Uniformization
Numerical algorithms based on uniformization have been proven to be numerically stable and computationally attractive for computing transient state distributions in homogeneous continuous-time Markov chains. Recently, van Dijk formulated uniformization for non-homogeneous Markov processes, and it is of interest to investigate numerical algorithms based on uniformization for non-homogeneou...
On essential information in sequential decision processes
This paper provides sufficient conditions under which certain information about the past of a stochastic decision process can be ignored by a controller. We illustrate the results with particular applications to queueing control, control of semi-Markov decision processes with iid sojourn times, and uniformization of continuous-time Markov decision processes.
Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions
We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decision processes with asynchronous events and actions. Using phase-type distributions and uniformization, we show how an arbitrary GSMDP can be approximated by a discrete-time MDP, which can then be solved using existing M...
Solving Generalized Semi-Markov Decision Processes Using Continuous Phase-Type Distributions
We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decision processes with asynchronous events and actions. Using phase-type distributions and uniformization, we show how an arbitrary GSMDP can be approximated by a discrete-time MDP, which can then be solved using existing M...
Four Canadian Contributions to Stochastic Modeling
We outline the history, significance, and impact of four important contributions by Canadian researchers to stochastic modeling for operational research: the use of the uniformization method to compute transient probabilities for Markov chains, pioneered by Winfried K. Grassmann, contributions to Markov decision processes by Martin L. Puterman, contributions to the development of random number ...